Sparse Submodular Probabilistic PCA
Authors
Abstract
We propose a novel approach to sparse probabilistic principal component analysis that combines a low-rank representation of the latent factors and loadings with a sparse variational inference method for estimating distributions of latent variables subject to sparse support constraints. Inference and parameter estimation for the resulting model are carried out via expectation maximization, with the variational method used in the E-step to induce sparsity. We show that this inference problem can be reduced to discrete optimal support selection. The discrete optimization is submodular; hence, greedy selection is guaranteed to achieve a (1 - 1/e) fraction of the optimal value. Empirical studies indicate that the proposed approach recovers a parsimonious decomposition more effectively than established baseline methods. We also evaluate our method against state-of-the-art methods on high-dimensional fMRI data and show that it performs as well as or better than them.
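The greedy support selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's objective or algorithm, which the abstract does not spell out: the function `greedy_select`, the toy `COVERS` data, and the `coverage` objective are all hypothetical, chosen only because set coverage is a standard monotone submodular function. The classical (1 - 1/e) guarantee for greedy selection holds for monotone submodular objectives under a cardinality constraint, which is the setting assumed here.

```python
from typing import Callable, FrozenSet, Iterable, List, Set


def greedy_select(ground: Iterable[int],
                  objective: Callable[[FrozenSet[int]], float],
                  k: int) -> List[int]:
    """Pick up to k elements, each time adding the one with the largest marginal gain."""
    selected: List[int] = []
    remaining = set(ground)
    current = objective(frozenset())
    for _ in range(k):
        best_item, best_gain = None, 0.0
        for item in sorted(remaining):  # sorted for deterministic tie-breaking
            gain = objective(frozenset(selected + [item])) - current
            if gain > best_gain:
                best_item, best_gain = item, gain
        if best_item is None:  # no remaining element improves the objective
            break
        selected.append(best_item)
        remaining.remove(best_item)
        current += best_gain
    return selected


# Hypothetical toy data: each candidate index "covers" a set of items.
COVERS = {
    0: {"a", "b", "c"},
    1: {"c", "d"},
    2: {"d", "e", "f", "g"},
    3: {"a", "g"},
}


def coverage(support: FrozenSet[int]) -> float:
    """Number of distinct items covered by the chosen support (monotone submodular)."""
    covered: Set[str] = set()
    for i in support:
        covered |= COVERS[i]
    return float(len(covered))


support = greedy_select(COVERS.keys(), coverage, k=2)
```

In a sparse PCA setting, the ground set would be the candidate support indices and the objective would be whatever set function the variational E-step reduces to; swapping in a different `objective` leaves the greedy loop unchanged.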
Similar resources
Sparse Probabilistic Principal Component Analysis
Principal component analysis (PCA) is a popular dimensionality reduction algorithm. However, it is not easy to interpret which of the original features are important based on the principal components. Recent methods improve interpretability by sparsifying PCA through adding an L1 regularizer. In this paper, we introduce a probabilistic formulation for sparse PCA. By presenting sparse PCA as a p...
Information Projection and Approximate Inference for Structured Sparse Variables
Approximate inference via information projection has been recently introduced as a general-purpose technique for efficient probabilistic inference given sparse variables. This manuscript goes beyond classical sparsity by proposing efficient algorithms for approximate inference via information projection that are applicable to any structure on the set of variables that admits enumeration using ma...
Globally Sparse Probabilistic PCA
With the flourishing development of high-dimensional data, sparse versions of principal component analysis (PCA) have established themselves as simple yet powerful ways of selecting relevant features in an unsupervised manner. However, when several sparse principal components are computed, the interpretation of the selected variables may be difficult since each axis has its own sparsity pattern and...
Structured Convex Optimization under Submodular Constraints
A number of discrete and continuous optimization problems in machine learning are related to convex minimization problems under submodular constraints. In this paper, we deal with a submodular function with a directed graph structure, and we show that a wide range of convex optimization problems under submodular constraints can be solved much more efficiently than general submodular optimizatio...
Extensions of probabilistic PCA
Principal component analysis (PCA) is a classical data analysis technique. Some algorithms for PCA scale better than others to problems with high dimensionality. They also differ in the ability to handle missing values in the data. In our recent paper [1], a case is studied where the data are high-dimensional and a majority of the values are missing. In the case of very sparse data, overfitting...
Publication date: 2015